[AMDGPU] Fold into uses of splat REG_SEQUENCEs through COPYs. #145691

kosarev · 2025-06-25T12:32:08Z

No description provided.

llvmbot · 2025-06-25T12:32:42Z

@llvm/pr-subscribers-backend-amdgpu

Author: Ivan Kosarev (kosarev)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/145691.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+8-1)
(modified) llvm/test/CodeGen/AMDGPU/packed-fp32.ll (+2-8)

diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index 8299c0dce1870..2d6afcbfb9447 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -1036,11 +1036,18 @@ void SIFoldOperandsImpl::foldOperand(
     // Grab the use operands first
     SmallVector<MachineOperand *, 4> UsesToProcess(
         llvm::make_pointer_range(MRI->use_nodbg_operands(RegSeqDstReg)));
-    for (auto *RSUse : UsesToProcess) {
+    for (unsigned I = 0; I != UsesToProcess.size(); ++I) {
+      MachineOperand *RSUse = UsesToProcess[I];
       MachineInstr *RSUseMI = RSUse->getParent();
       unsigned OpNo = RSUseMI->getOperandNo(RSUse);
 
       if (SplatVal) {
+        if (RSUseMI->isCopy()) {
+          Register DstReg = RSUseMI->getOperand(0).getReg();
+          append_range(UsesToProcess,
+                       make_pointer_range(MRI->use_nodbg_operands(DstReg)));
+          continue;
+        }
         if (MachineOperand *Foldable =
                 tryFoldRegSeqSplat(RSUseMI, OpNo, SplatVal, SplatRC)) {
           appendFoldCandidate(FoldList, RSUseMI, OpNo, Foldable);
diff --git a/llvm/test/CodeGen/AMDGPU/packed-fp32.ll b/llvm/test/CodeGen/AMDGPU/packed-fp32.ll
index bef38c1a65ef8..4db3f2189bfc3 100644
--- a/llvm/test/CodeGen/AMDGPU/packed-fp32.ll
+++ b/llvm/test/CodeGen/AMDGPU/packed-fp32.ll
@@ -2155,11 +2155,8 @@ define amdgpu_kernel void @fadd_fadd_fsub_0(<2 x float> %arg) {
 ; GFX90A-GISEL-LABEL: fadd_fadd_fsub_0:
 ; GFX90A-GISEL:       ; %bb.0: ; %bb
 ; GFX90A-GISEL-NEXT:    s_load_dwordx2 s[0:1], s[4:5], 0x24
-; GFX90A-GISEL-NEXT:    s_mov_b32 s2, 0
-; GFX90A-GISEL-NEXT:    s_mov_b32 s3, s2
-; GFX90A-GISEL-NEXT:    v_pk_mov_b32 v[0:1], s[2:3], s[2:3] op_sel:[0,1]
 ; GFX90A-GISEL-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX90A-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], v[0:1]
+; GFX90A-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], 0
 ; GFX90A-GISEL-NEXT:    v_mov_b32_e32 v0, v1
 ; GFX90A-GISEL-NEXT:    v_pk_add_f32 v[0:1], v[0:1], 0
 ; GFX90A-GISEL-NEXT:    v_mov_b32_e32 v2, s0
@@ -2170,11 +2167,8 @@ define amdgpu_kernel void @fadd_fadd_fsub_0(<2 x float> %arg) {
 ; GFX942-GISEL-LABEL: fadd_fadd_fsub_0:
 ; GFX942-GISEL:       ; %bb.0: ; %bb
 ; GFX942-GISEL-NEXT:    s_load_dwordx2 s[0:1], s[4:5], 0x24
-; GFX942-GISEL-NEXT:    s_mov_b32 s2, 0
-; GFX942-GISEL-NEXT:    s_mov_b32 s3, s2
-; GFX942-GISEL-NEXT:    v_mov_b64_e32 v[0:1], s[2:3]
 ; GFX942-GISEL-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX942-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], v[0:1]
+; GFX942-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], 0
 ; GFX942-GISEL-NEXT:    s_nop 0
 ; GFX942-GISEL-NEXT:    v_mov_b32_e32 v0, v1
 ; GFX942-GISEL-NEXT:    v_pk_add_f32 v[0:1], v[0:1], 0
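
For readers following the SIFoldOperands.cpp hunk: the range-based `for` is replaced by an index-based loop, presumably because the loop body now appends to `UsesToProcess` while walking it, and growing a vector can invalidate the iterators a range-based loop holds. Below is a minimal standalone sketch of that worklist pattern. It uses `std::vector` and toy types rather than LLVM classes; `ToyUse`, `UseMap` and `collectNonCopyUsers` are hypothetical names introduced only for illustration, and the sketch assumes the copy chains are acyclic.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy stand-in for an instruction that reads one register.
// Hypothetical type for illustration; not an LLVM class.
struct ToyUse {
  int Id;      // identifies the using instruction
  bool IsCopy; // true if the user is a plain register copy
  int DstReg;  // register defined by the copy (ignored otherwise)
};

// Maps a register number to the instructions that read it.
using UseMap = std::map<int, std::vector<ToyUse>>;

// Returns the ids of all non-copy users of RootReg, looking through
// chains of copies. Assumes the copy chains contain no cycles.
std::vector<int> collectNonCopyUsers(const UseMap &Uses, int RootReg) {
  std::vector<ToyUse> Worklist;
  auto Root = Uses.find(RootReg);
  if (Root != Uses.end())
    Worklist = Root->second;

  std::vector<int> Result;
  // Index-based loop: the size is re-read on every iteration, so entries
  // appended inside the body are visited too. A range-based for would
  // hold iterators that the append below may invalidate.
  for (std::size_t I = 0; I != Worklist.size(); ++I) {
    ToyUse U = Worklist[I]; // copy by value before the vector may grow
    if (U.IsCopy) {
      // Look through the copy: queue the users of its destination.
      auto CopyUses = Uses.find(U.DstReg);
      if (CopyUses != Uses.end())
        Worklist.insert(Worklist.end(), CopyUses->second.begin(),
                        CopyUses->second.end());
      continue;
    }
    Result.push_back(U.Id);
  }
  return Result;
}
```

In this sketch a register with users `{10, copy to r2}` where `r2` has users `{12}` yields `10` and `12`, which mirrors how the patch reaches fold candidates that sit behind a COPY of the splat REG_SEQUENCE result.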

arsenm · 2025-06-25T12:35:22Z

I have a similar patch somewhere, but I think we need to fix #140608 first.

#140878 is also related.

arsenm · 2025-06-26T08:01:20Z

llvm/test/CodeGen/AMDGPU/packed-fp32.ll

 ; GFX90A-GISEL-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX90A-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], v[0:1]
+; GFX90A-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], 0


The only test change is one globalisel case, so that's a hint this is just hiding a missed optimization that should have happened earlier in the pipeline. Do you have another example where this is useful?

Discussed elsewhere.

kosarev

Rebased.

kosarev · 2025-06-27T10:19:30Z

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

      MachineInstr *RSUseMI = RSUse->getParent();
      unsigned OpNo = RSUseMI->getOperandNo(RSUse);

      if (SplatRC) {
+        if (RSUseMI->isCopy()) {


I guess this could be isFoldableCopy(), but I hesitate to use it without test coverage.

kosarev · 2025-06-27T10:19:42Z

llvm/test/CodeGen/AMDGPU/packed-fp32.ll

 ; GFX90A-GISEL-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX90A-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], v[0:1]
+; GFX90A-GISEL-NEXT:    v_pk_add_f32 v[0:1], s[0:1], 0


Discussed elsewhere.

kosarev requested review from jayfoad and arsenm June 25, 2025 12:32

llvmbot added the backend:AMDGPU label Jun 25, 2025

kosarev changed the title ~~[AMDGPU][GFX13] Fold into uses of splat REG_SEQUENCEs through COPYs.~~ [AMDGPU] Fold into uses of splat REG_SEQUENCEs through COPYs. Jun 25, 2025

kosarev force-pushed the splat-copies branch from 31915a6 to c8c8455 Compare June 25, 2025 12:33

arsenm reviewed Jun 26, 2025

[AMDGPU] Fold into uses of splat REG_SEQUENCEs through COPYs.

9300f4e

kosarev force-pushed the splat-copies branch from c8c8455 to 9300f4e Compare June 27, 2025 10:15

kosarev commented Jun 27, 2025
